Systems and methods for generating supplemental content for media content

ABSTRACT

Systems and methods are disclosed herein for generating supplemental content for media content. One disclosed technique herein generates for display a page of an electronic book. A noun, and a word contextually related to the noun, are identified from the displayed page of the electronic book. Content structures are searched for a content structure that includes a matching object having an object name matching the noun. The content structure includes objects, where each object has attribute table entries. Upon finding an identified attribute table entry of the matching object that matches the related word, a new content structure is generated. The new content structure includes the matching object and the identified attribute table entry. A content segment is generated for output (e.g., for display on the electronic book) based on the new content structure.

BACKGROUND

The present disclosure is directed to techniques for generating supplemental content in relation to media content, and more particularly to techniques for generating supplemental content for electronic books.

SUMMARY

Electronic devices commonly allow a user to access and read an electronic book. While the user is reading the electronic book, however, little or no contextual information is provided to the user. In one approach, the electronic book may include hyperlinks for some of the words or phrases on the current page. The electronic book may also include pre-rendered static illustrations to accompany the text. However, these electronic books lack a dynamic and automatic ability to generate and present supplemental content (e.g., audio, video, images) that relates to the text shown by the electronic book.

Accordingly, techniques are disclosed herein for generating and presenting real-time supplemental content for an electronic book. In some embodiments, the techniques generate and present supplemental content by altering contextual aspects of existing content structures to generate new content segments for output in electronic books.

One disclosed technique herein generates for display a page of an electronic book. A noun, and a related word contextually related to the noun, are identified from the displayed page of the electronic book. Using the identified noun, the system searches content structures for a content structure that includes a matching object having an object name matching the noun identified from the electronic book. The content structures include objects, where each object has attribute table entries. Upon finding an identified attribute table entry of the matching object that matches the related word, a new content structure is generated. The new content structure includes the matching object and the identified attribute table entry. A content segment is generated for output (e.g., for display on the electronic book) based on the new content structure. Exemplary content structures that can be used to generate new content structures and render content segments are described in co-pending application Ser. No. 16/363,919, entitled “SYSTEMS AND METHODS FOR CREATING CUSTOMIZED CONTENT,” filed on Mar. 25, 2019, which is hereby expressly incorporated by reference herein in its entirety.

Various techniques are disclosed herein for when the identified attribute table entries of the matching object do not match the related word. One disclosed technique determines an approximate attribute table entry which (approximately) matches the related word. A new content structure is generated including the matching object and the approximate attribute table entry. A content segment is generated for output (e.g., for display on the electronic book) based on the new content structure. Another disclosed technique provides for generating a new content structure comprising the matching object but excluding the non-matching attribute table entry. A content segment is then generated based on the new content structure, which excludes the non-matching attribute table entry.

In some embodiments, the content structures further include virtual modelling data (e.g., vectoring data) for the objects and attribute table entries. Generating the content segment includes determining matching virtual modelling data of the matching object, including the identified attribute table entry. The content segment is rendered (e.g., as a 3D animation) and generated for output based on the matching virtual modelling data. In other embodiments, the content segment may be outputted within the margins of the page of the electronic book. Exemplary content structures utilizing virtual modelling data are provided in co-pending application Ser. No. 16/451,823, entitled “SYSTEMS AND METHODS FOR CREATING CUSTOMIZED CONTENT,” filed on Jun. 25, 2019, which is hereby expressly incorporated by reference herein in its entirety.

Numerous techniques are disclosed herein for determining an output duration of the generated content segment. One disclosed technique provides for determining first and second reading locations on the page of the electronic book (e.g., via an optical sensor of a device) at first and second time stamps. The second time stamp occurs after the first time stamp, and the amount of text between the first and second reading locations is determined. An average reading speed value is then determined using the amount of text and the difference between the first time stamp and the second time stamp. The content segment is generated for output for a duration based on the determined average reading speed value. The output may be extended upon a determination that the current reading location on the page of the electronic book matches the location of the outputted content segment.

BRIEF DESCRIPTION OF THE DRAWINGS

The below and other objects and advantages of the disclosure will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:

FIG. 1A shows an illustrative diagram for an electronic book displayed on a tablet device, in accordance with some embodiments of the disclosure;

FIG. 1B shows an illustrative diagram of an exemplary content structure, in accordance with some embodiments of the disclosure;

FIG. 1C shows an illustrative diagram for specific attribute table entry selections from a content structure, in accordance with some embodiments of the disclosure;

FIG. 1D shows an illustrative diagram for generating supplemental content for media content on the tablet device, in accordance with some embodiments of the disclosure;

FIG. 2 shows an illustrative data flow diagram including a device, a linguistics processing engine, and a construction engine, in accordance with some embodiments of the disclosure;

FIG. 3 shows an illustrative system diagram of the linguistics processing engine, the content structure, the construction engine, the content segment, and devices, in accordance with some embodiments of the disclosure;

FIG. 4A shows an illustrative block diagram of the linguistics processing engine, in accordance with some embodiments of the disclosure;

FIG. 4B shows an illustrative block diagram of the construction engine, in accordance with some embodiments of the disclosure;

FIG. 5 is an illustrative flowchart of a process for generating supplemental content for media content, in accordance with some embodiments of the disclosure; and

FIG. 6 is an illustrative flowchart of a process for outputting a content segment for an output duration based on an average reading speed value, in accordance with some embodiments of the disclosure.

DETAILED DESCRIPTION

FIG. 1A shows an illustrative diagram 100 for an electronic book displayed on a tablet device, in accordance with some embodiments of the disclosure. An electronic tablet device 102 includes a linguistics processing engine 110 (local and/or remote) that may generate for display an electronic book 103. In this example, the electronic book includes the following text: “Jennifer was eating sushi from her favorite restaurant one block away having the most famous California rolls in the city.”

The linguistics processing engine may analyze the text from the electronic book. In particular, the analysis may include the classification of words from the text as nouns and contextual information related to respective nouns. The linguistics processing engine may identify a noun from the displayed page using various techniques. In some embodiments, the linguistics processing engine receives metadata of the electronic book from a device displaying the electronic book and/or an Internet service providing the electronic book as electronic content. The metadata may include all text within the electronic book and/or other text-related information about the electronic book. In other embodiments, the linguistics processing engine may implement Optical Character Recognition techniques to parse the words from the electronic book. Continuing from the above example, the linguistics processing engine may identify the word “sushi” 106 as the noun from the displayed page. Even though “Jennifer” 104 is also a noun, the linguistics processing engine selects “sushi” as the noun for analysis. The identification of a particular noun from a plurality of displayed nouns may be based on a variety of techniques disclosed herein. In some embodiments, the noun may be identified based on a relevance score computed from analysis of the entire text on the displayed page. In this embodiment, the entire text is analyzed for relevance based on user metadata. Nouns may be analyzed against one or more elements of user metadata using relevance algorithms to generate a relevancy score. The noun with the highest score may be identified as the noun from the displayed page. In other embodiments, relevance may be based on textual analysis of all words on a page displayed by the electronic book, using relevance algorithms known to a person of ordinary skill in the art. In yet other embodiments, relevance may be based on textual analysis of the words of at least a portion of the entire electronic book, using relevance algorithms known to a person of ordinary skill in the art.
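
By way of illustration, the noun-selection step might be sketched as follows. This is a minimal sketch only: the scoring scheme (page frequency plus a bonus for nouns appearing in user metadata) and all function names are assumptions, since the disclosure leaves the particular relevance algorithm open.

```python
# Hypothetical sketch of noun selection by relevancy score; the scoring
# scheme (page frequency + user-metadata bonus) is illustrative only.
from collections import Counter

def score_nouns(nouns, page_text, user_metadata_keywords):
    """Rank candidate nouns by a simple relevance score."""
    word_counts = Counter(page_text.lower().replace(",", "").split())
    scores = {}
    for noun in nouns:
        frequency = word_counts[noun.lower()]          # occurrences on the page
        bonus = 2 if noun.lower() in user_metadata_keywords else 0
        scores[noun] = frequency + bonus
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

page = ("Jennifer was eating sushi from her favorite restaurant one block "
        "away having the most famous California rolls in the city.")
ranked = score_nouns(["Jennifer", "sushi", "restaurant"], page,
                     user_metadata_keywords={"sushi", "food"})
print(ranked[0][0])  # "sushi" is identified as the noun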

The linguistics processing engine may identify a related word from the displayed page that is contextually related to the noun. Continuing from the above example, the words “Jennifer” 104, “restaurant one block away” 108, and “California rolls” 109 are identified as contextually related to the noun “sushi.” The linguistics processing engine may identify related words from the displayed page as contextual based on various relevance algorithms. In some embodiments, all words within the displayed page are queried in a data structure together with the identified noun. The data structure provides a relevancy score for the words relative to the identified noun. The words with the highest relevancy scores may be identified as related words.
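
A proximity-based stand-in for that relevancy-score lookup is sketched below; the window size, stopword list, and scoring are assumptions, as the disclosure does not fix a particular algorithm.

```python
# Hypothetical related-word scoring: words are scored by how often they
# appear within a fixed window of the identified noun (a stand-in for the
# relevancy-score data structure described above).
STOPWORDS = {"was", "from", "her", "the", "one", "a", "in", "most", "having"}

def related_words(text, noun, window=14):
    tokens = text.lower().replace(",", "").rstrip(".").split()
    positions = [i for i, t in enumerate(tokens) if t == noun.lower()]
    scores = {}
    for pos in positions:
        lo, hi = max(0, pos - window), min(len(tokens), pos + window + 1)
        for i in range(lo, hi):
            word = tokens[i]
            if i != pos and word not in STOPWORDS:
                scores[word] = scores.get(word, 0) + 1
    return sorted(scores, key=scores.get, reverse=True)

print(related_words("Jennifer was eating sushi from her favorite restaurant "
                    "one block away having the most famous California rolls "
                    "in the city.", "sushi"))
# ['jennifer', 'eating', 'favorite', 'restaurant', ..., 'california', 'rolls']
```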

FIG. 1B shows an illustrative diagram 111 of an exemplary content structure, in accordance with some embodiments of the disclosure. The linguistics processing engine 110 interfaces with a content structure 133, which includes an attribute table 131 and mapping 132. Specifically, attribute table 131 may include object data structures 134 including attributes relating to an object. The object data structure 134 includes attribute table entries such as a descriptive structure 135, an action structure 136, an audio structure, etc. Attribute table entries may be attributes, states, actions, or other types of substructure within the attribute table. The descriptive structure 135 lists attributes such as object name, object type (e.g., sushi, tacos, spaghetti, samosas, etc.), features 135a (e.g., for the sushi object type: California roll, Philadelphia roll, sashimi, maki, nigiri, etc.), and states 135b (e.g., in chopsticks, for presentation, partially eaten, pre-cut, etc.). The features 135a and states 135b may include different attributes based on the object type. The states 135b may include an emotional state and a motion state (e.g., inanimate, picked up with fork, picked up with chopsticks, picked up with hands).

The action structure 136 is descriptive of actions that the object is performing on or to other objects. The action structure 136 lists action name/type 136a (e.g., being prepared, being eaten, being eaten by Jon, being eaten by Jennifer, etc.), object(s) that the action involves, absolute location 136b of the object with respect to the video frame, relative location 136c relative to other object(s), absolute motion 136e, relative motion 136f, etc. The mapping 132b corresponding to the action attribute 136a may include a value indicative of a rate or a degree at which the action is taking place (e.g., eaten “slowly,” “feverishly,” “quickly,” etc.).

Similarly, mapping 132 further shows action mapping 136a₄, absolute location mappings 136b₁₋₂, relative location mappings 215a, 217a, 217b and 218a, absolute motion mapping 136e₁, relative motion mappings 136f₁₋₄, setting mappings, and setting feature mappings. In some embodiments, the mapping may be temporal, locational, or other value-based mappings corresponding to a specific object, action, state, or attribute. In some embodiments, the mapping may be independent of the specific object, action, state, or attribute. For example, the mapping may be of a general phenomenon independent of a corresponding object/action. Instead, any object within the proximity of that phenomenon may receive the respective mapping.
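
As a rough in-memory illustration of the organization described above (attribute table 131 plus mapping 132), the structure could be modelled as follows; the class and field names are hypothetical, not the disclosure's actual schema.

```python
# Hypothetical in-memory model of a content structure; names and fields
# are illustrative only.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class AttributeEntry:
    kind: str                                      # "feature", "state", or "action"
    value: str                                     # e.g., "California roll"
    mapping: Optional[Tuple[float, float]] = None  # e.g., temporal mapping (start_s, end_s)

@dataclass
class ContentObject:
    name: str                                      # object name, e.g., "sushi"
    entries: List[AttributeEntry] = field(default_factory=list)

@dataclass
class ContentStructure:
    objects: List[ContentObject] = field(default_factory=list)

sushi = ContentObject("sushi", [
    AttributeEntry("feature", "California roll"),
    AttributeEntry("state", "in chopsticks"),
    AttributeEntry("action", "being eaten by Jennifer", mapping=(0.0, 30.0)),
    AttributeEntry("action", "presented for eating"),
])
structure = ContentStructure([sushi])
```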

As previously mentioned, exemplary content structures that can be used to generate new content structures and render content segments are described in co-pending application Ser. No. 16/363,919, entitled “SYSTEMS AND METHODS FOR CREATING CUSTOMIZED CONTENT,” filed on Mar. 25, 2019, which is hereby expressly incorporated by reference herein in its entirety.

FIG. 1C shows an illustrative diagram 121 for specific attribute table selections from a content structure, in accordance with some embodiments of the disclosure. A construction engine 150 may search a plurality of content structures for a content structure that comprises a matching object with an object name that matches the noun. Each of the content structures may include one or more objects. Each of the objects may include one or more attributes. Continuing from the above example, the content structure 133 includes an object 111 with the object name “sushi.” The attribute table 131 includes attribute table entries, namely, an attribute that the sushi is a “California roll” 135a₁, the state “in chopsticks” 135aₙ, and the actions “being eaten by Jennifer” 135b₁ and “presented for eating” 135bₙ. The identified related words include California roll, which matches the attribute table entry “California roll” 135a₁. Additionally, the identified related word “Jennifer” matches the attribute table entry “being eaten by Jennifer” 135b₁.

The construction engine, in response to identifying an attribute table entry of the matching object that matches the related word, may generate a new content structure comprising the matching object. The matching object comprises the identified attribute table entry. Continuing from the above example, the matching object would be object 111 with the matching attribute table entries “California roll” and “being eaten by Jennifer.”
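
Continuing with the hypothetical dataclasses sketched earlier (ContentStructure, ContentObject, AttributeEntry), the object-matching and new-structure generation might look like this; the matching rule (case-insensitive substring) is an assumption for illustration.

```python
# Hypothetical matching step: find the object whose name matches the noun,
# keep only attribute table entries matching a related word, and build a
# new content structure from them (reuses the dataclasses sketched above).
def build_new_structure(structures, noun, related):
    for structure in structures:
        for obj in structure.objects:
            if obj.name.lower() != noun.lower():
                continue
            matched = [e for e in obj.entries
                       if any(r.lower() in e.value.lower() for r in related)]
            if matched:
                return ContentStructure([ContentObject(obj.name, matched)])
    return None  # no matching object/entries found

new_structure = build_new_structure([structure], "sushi",
                                    ["California roll", "Jennifer"])
print([e.value for e in new_structure.objects[0].entries])
# ['California roll', 'being eaten by Jennifer']
```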

The construction engine, in response to identifying an attribute table entry of the matching object that does not match the related word, may determine an approximate attribute table entry of the matching object that matches the related word and generate for output a content segment based on the new content structure. Continuing from the above example, the matching object would be object 111, which may not have the entry “California roll” but instead has the entry “Philadelphia roll.” The construction engine may determine that “Philadelphia roll” is an approximate attribute table entry of the matching object “sushi” that matches the related word “California roll.” The approximate matching may be based on lexical similarity, contextual similarity of word definitions, similarity of aggregate or singular user selection history for both terms, or other similarity algorithms known to a person of ordinary skill in the art.
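
For the lexical-similarity case only, a minimal sketch using Python's standard difflib could look like the following; the cutoff value is an arbitrary assumption, and the other similarity measures named above (contextual, selection-history) would need different machinery.

```python
# Hypothetical approximate matching on lexical similarity, using the
# standard-library difflib; the 0.4 cutoff is an arbitrary illustration.
import difflib

def approximate_entry(obj, related_word, cutoff=0.4):
    values = [e.value for e in obj.entries]
    close = difflib.get_close_matches(related_word, values, n=1, cutoff=cutoff)
    if not close:
        return None
    return next(e for e in obj.entries if e.value == close[0])

# "Philadelphia roll" is the lexically closest entry to "California roll".
entry = approximate_entry(
    ContentObject("sushi", [AttributeEntry("feature", "Philadelphia roll")]),
    "California roll")
print(entry.value if entry else None)  # Philadelphia roll
```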

The construction engine, in response to identifying an attribute table entry of the matching object that does not match the related word, may generate a new content structure including the matching object but excluding the non-matching attribute table entry, and generate for output a content segment based on the new content structure. Continuing from the above example, the matching object (i.e., object 111) would exclude the attribute table entry “presented for eating” 135bₙ.

FIG. 1D shows an illustrative diagram 131 for generating supplemental content for media content on the tablet device, in accordance with some embodiments of the disclosure. The construction engine may generate for output a content segment based on the new content structure. Continuing from the above example, object 111 with the matching attribute table entries “California Roll” and “being eaten by Jennifer” has corresponding mappings from the content structure. For example, the mappings for the attribute table entry (action) “being eaten by Jennifer” may provide temporal values (e.g., 135b₁) for a specific animation for the action; namely, that the animation of this content segment may be output for 30 seconds. As mentioned previously, further details relating to the creation and organization of content structures are provided in co-pending and commonly assigned U.S. nonprovisional application Ser. No. 16/363,919, entitled “SYSTEMS AND METHODS FOR CREATING CUSTOMIZED CONTENT,” filed on Mar. 25, 2019, which is hereby expressly incorporated by reference herein in its entirety.

In some embodiments, the construction engine may generate for output a content segment based on the new content structure within the margins of the page of the electronic book. In other embodiments, the construction engine may generate for output a content segment based on the new content structure behind at least a portion of the displayed text of the page of the electronic book by configuring an amount of opacity for the content segment. In yet other embodiments, the content segment may include only an audio output to provide contextual ambiance to the text displayed on the page. For example, if the scene takes place in a café, the audio output may provide ambient café atmospheric noise, including temporal mapping values that last for the duration of the text taking place in the café.
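
For the audio-only case, the temporal extent might simply be derived from how long the reader is expected to spend on the scene's text, as in this hypothetical calculation (function name and figures are illustrative):

```python
# Hypothetical duration for ambient audio: the time the reader is expected
# to spend on the text of the scene, given an average reading speed.
def ambient_audio_duration_s(words_in_scene, reading_speed_wpm):
    return words_in_scene / reading_speed_wpm * 60.0

print(ambient_audio_duration_s(250, 200))  # 75.0 seconds of café ambiance
```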

One of the disclosed techniques provides for content structures including virtual modelling data for the objects and attribute table entries. The construction engine generates a content segment for output by determining matching virtual modelling data of the matching object. The matching object may include the identified attribute table entry. The construction engine renders (and generates for output) the content segment based on the matching virtual modelling data. The virtual modelling data may be any type of data that provides information for the creation of at least one of 2D animation, 3D animation, holographic representation, avatar-based modelling, or representations produced by artificial-intelligence generation engines. Continuing from the above example, the “being eaten by Jennifer” and “California Roll” identified attribute table entries may have vectoring information corresponding to a positional 3D mapping in x-y-z coordinate space. As mentioned earlier, exemplary content structures utilizing virtual modelling data are provided in co-pending application Ser. No. 16/451,823, entitled “SYSTEMS AND METHODS FOR CREATING CUSTOMIZED CONTENT,” filed on Jun. 25, 2019, which is hereby expressly incorporated by reference herein in its entirety. Based on this corresponding vectoring information, a 3D animation of Jennifer eating sushi is generated as a content segment for output for the electronic book.
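
A rough sketch of how vectoring data might ride along with an attribute table entry is shown below; the structure and the renderer hook are assumptions, since the actual modelling format is defined in the incorporated application, not here.

```python
# Hypothetical carrier for virtual modelling (vectoring) data; the actual
# format is defined in the incorporated application, not here.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class VirtualModellingData:
    object_name: str                              # e.g., "sushi"
    entry_value: str                              # attribute entry the vectors animate
    keyframes: List[Tuple[float, float, float]]   # positional 3D mapping (x, y, z)

def render_segment(models: List[VirtualModellingData], duration_s: float):
    """Stand-in for handing keyframes to a 3D renderer."""
    for m in models:
        print(f"animate {m.object_name} ({m.entry_value}) "
              f"for {duration_s}s from {m.keyframes[0]}")

render_segment([VirtualModellingData(
    "sushi", "being eaten by Jennifer",
    keyframes=[(0.0, 1.2, 0.4), (0.1, 1.15, 0.4)])], 30.0)
```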

Numerous techniques are disclosed herein for determining an output duration of the generated content segment. One disclosed technique provides for the linguistics processing engine determining first and second reading locations on the page of the electronic book (e.g., via an optical sensor of a device) at first and second time stamps. The linguistics processing engine then determines the amount of text between the first reading location and the second reading location. An average reading speed value is determined by the linguistics processing engine based on the amount of text and the difference between the first time stamp and the second time stamp. The output of the content segment is based on an output duration, where the length of the output duration is based on the determined average reading speed value. Continuing from the above example, the tablet device that is displaying the electronic book has an embedded front-mounted camera that can detect the viewing angle of the user. Based on the viewing angle measured by the embedded front-mounted camera, the linguistics processing engine determines first and second viewing angles as the user reads consecutive lines of the electronic book. An average reading speed value is calculated as 200 words per minute. Based on this value, the content segment of Jennifer eating the California Roll will be output for approximately 30 seconds, as the contextual information about Jennifer eating the California Roll spans approximately three lines comprising 100 words (100 words at 200 words per minute is equivalent to 30 seconds). In some embodiments, the content segments are output in real time as the optical sensor (or similar) detects the specific text that is being viewed by the user.
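
The arithmetic of this example can be made explicit with a small sketch (function names are illustrative):

```python
# Hypothetical reading-speed and output-duration arithmetic from the example:
# 100 words read in 30 seconds -> 200 wpm; 100 words of context -> 30 seconds.
def average_reading_speed_wpm(words_between, t1_s, t2_s):
    return words_between / ((t2_s - t1_s) / 60.0)

def output_duration_s(words_of_context, wpm):
    return words_of_context / wpm * 60.0

wpm = average_reading_speed_wpm(100, t1_s=0.0, t2_s=30.0)
print(wpm, output_duration_s(100, wpm))  # 200.0 wpm, 30.0 seconds
```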

In some embodiments, in response to a determination that the current reading location on the page of the electronic book matches the location of the outputted content segment, the output duration is extended. For example, if the system determines via the optical sensor that the user is viewing the 3D animation of Jennifer eating the California Roll, this animation will be extended indefinitely as long as the user continues to view the animation.

In some embodiments, the linguistics processing engine may determine a viewing angle of a user on the displayed page of the electronic book via an optical sensor. Based on the optical sensor, the linguistics processing engine may determine a particular line of text on the displayed page of the electronic book. For example, the optical sensor may be embedded in a device generating for display the page of the electronic book (e.g., tablet device, television, computer, smartphone) or in a wearable device (e.g., smart-glasses, smart jewelry, etc.).
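
One plausible (entirely hypothetical) way to map a sensed gaze angle to a line of text is linear interpolation between the angles subtended by the top and bottom of the page; the disclosure does not specify the mapping.

```python
# Hypothetical mapping from a sensed vertical gaze angle to a line index,
# assuming the page spans [top_deg, bottom_deg] linearly in the sensor frame.
def line_from_viewing_angle(angle_deg, top_deg, bottom_deg, lines_on_page):
    fraction = (angle_deg - top_deg) / (bottom_deg - top_deg)
    return max(0, min(lines_on_page - 1, int(fraction * lines_on_page)))

print(line_from_viewing_angle(12.0, top_deg=10.0, bottom_deg=20.0,
                              lines_on_page=40))  # line 8
```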

FIG. 2 shows an illustrative data flow diagram 200 including a device, a linguistics processing engine, and a construction engine, in accordance with some embodiments of the disclosure. The linguistics processing engine 204 generates for display a page 210 from an electronic book for a device 202. The linguistics processing engine 204 identifies a noun from the displayed page and then identifies a related word from the displayed page that is contextually related to the noun. The linguistics processing engine 204 transmits information to the construction engine 206 to search 212 a plurality of content structures for a content structure that includes a matching object with an object name that matches the noun. The construction engine 206 identifies an attribute table entry of the matching object that matches the related word. The construction engine 206 generates a new content structure including the matching object and the identified attribute table entry. The construction engine 206 generates 214 a content segment based on the new content structure and transmits the content segment to the device 202.

FIG. 3 shows an illustrative system diagram 300 of the linguistics processing engine, the content structure, the construction engine, the content segment, and devices, in accordance with some embodiments of the disclosure. The linguistics processing engine 302 may be any hardware that provides processing and transmit/receive functionality. The linguistics processing engine may be communicatively coupled to multiple electronic devices (e.g., device 1 (306), device 2 (307), and device n (309)). The linguistics processing engine may be communicatively coupled to a content structure 310, a construction engine 304, and a content segment 308. As illustrated in FIG. 3, a further detailed disclosure of the linguistics processing engine can be seen in FIG. 4A, showing an illustrative block diagram of the linguistics processing engine, in accordance with some embodiments of the disclosure. Additionally, as illustrated in FIG. 3, a further detailed disclosure of the construction engine can be seen in FIG. 4B, showing an illustrative block diagram of the construction engine, in accordance with some embodiments of the disclosure.

In some embodiments, the linguistics processing engine may be implemented remote from the devices 306-309, such as in a cloud server configuration. The linguistics processing engine may be any device for retrieving information from the devices 306-309 and identifying and/or parsing textual and other information from media content played on devices 306-309. The linguistics processing engine may be implemented by a television, a Smart TV, a set-top box, an integrated receiver decoder (IRD) for handling satellite television, a digital storage device, a digital media receiver (DMR), a digital media adapter (DMA), a streaming media device, a DVD player, a DVD recorder, a connected DVD, a local media server, a BLU-RAY player, a BLU-RAY recorder, a personal computer (PC), a laptop computer, a tablet computer, a WebTV box, a personal computer television (PC/TV), a PC media server, a PC media center, a handheld computer, a stationary telephone, a personal digital assistant (PDA), a mobile telephone, a portable video player, a portable music player, a portable gaming machine, a smart phone, or any other television equipment, computing equipment, Internet-of-Things device, wearable device, or wireless device, and/or combination of the same. Any of the system modules (e.g., linguistics processing engine, data structure, ISP, and electronic devices) may be any combination of shared or disparate hardware pieces that are communicatively coupled.

In some embodiments, the construction engine may be implemented remote from the electronic devices 306-309, such as in a cloud server configuration. The construction engine may be any device for accessing the content structure and generating content segments as described above. The construction engine may be implemented by a television, a Smart TV, a set-top box, an integrated receiver decoder (IRD) for handling satellite television, a digital storage device, a digital media receiver (DMR), a digital media adapter (DMA), a streaming media device, a DVD player, a DVD recorder, a connected DVD, a local media server, a BLU-RAY player, a BLU-RAY recorder, a personal computer (PC), a laptop computer, a tablet computer, a WebTV box, a personal computer television (PC/TV), a PC media server, a PC media center, a handheld computer, a stationary telephone, a personal digital assistant (PDA), a mobile telephone, a portable video player, a portable music player, a portable gaming machine, a smart phone, or any other television equipment, computing equipment, Internet-of-Things device, wearable device, or wireless device, and/or combination of the same. Any of the system modules (e.g., linguistics processing engine, data structure, ISP, and electronic devices) may be any combination of shared or disparate hardware pieces that are communicatively coupled.

In some embodiments, the linguistics processing engine, construction engine, and a device from devices 306-309 may be implemented within a single local device. In other embodiments, the linguistics processing engine and construction engine may be implemented within a single local device.

The electronic devices (e.g., device 1 (306), device 2 (307), device n (309)) may be any device that has properties to transmit/receive network data as well as an interface to play back media content (e.g., touchscreen, speakers, keyboard, voice command input and confirmation, or any other similar interfaces). The devices 306-309 may be implemented by a television, a Smart TV, a set-top box, an integrated receiver decoder (IRD) for handling satellite television, a digital storage device, a digital media receiver (DMR), a digital media adapter (DMA), a streaming media device, a DVD player, a DVD recorder, a connected DVD, a local media server, a BLU-RAY player, a BLU-RAY recorder, a personal computer (PC), a laptop computer, a tablet computer, a WebTV box, a personal computer television (PC/TV), a PC media server, a PC media center, a handheld computer, a stationary telephone, a personal digital assistant (PDA), a mobile telephone, a portable video player, a portable music player, a portable gaming machine, a smart phone, or any other television equipment, computing equipment, Internet-of-Things device, wearable device, or wireless device, and/or combination of the same.

The content structure 310 may be any database, server, or computing device that contains memory for receiving and transmitting data related to the attribute table 314 and mapping 312. Example data that may be stored in the content structure, as described earlier, can be seen in FIG. 1B. The content structure may be cloud based, integrated into the linguistics processing engine, integrated into the construction engine, and/or integrated into one of the devices 306-309. In some embodiments, the content structure is communicatively coupled to both the linguistics processing engine 302 and the construction engine 304.

The content segment 308 may be any data or information that is generated by the construction engine 304. The content segment may be transmitted by the construction engine 304 to any of the devices 306-309. The content segment may be communicatively coupled to the devices 306-309, the construction engine 304, and the linguistics processing engine 302.

FIG. 4A shows an illustrative block diagram 400 of the linguistics processing engine, in accordance with some embodiments of the disclosure. In some embodiments, the linguistics processing engine may be communicatively connected to a user interface. In some embodiments, the linguistics processing engine may include processing circuitry, control circuitry, and storage (e.g., RAM, ROM, hard disk, removable disk, etc.). The linguistics processing engine may include an input/output path 406. I/O path 406 may provide device information, or other data, over a local area network (LAN) or wide area network (WAN), and/or other content and data to control circuitry 404, which includes processing circuitry 408 and storage 410. Control circuitry 404 may be used to send and receive commands, requests, signals (digital and analog), and other suitable data using I/O path 406. I/O path 406 may connect control circuitry 404 (and specifically processing circuitry 408) to one or more communications paths.

Control circuitry 404 may be based on any suitable processing circuitry such as processing circuitry 408. As referred to herein, processing circuitry should be understood to mean circuitry based on one or more microprocessors, microcontrollers, digital signal processors, programmable logic devices, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), etc., and may include a multi-core processor (e.g., dual-core, quad-core, hexa-core, or any suitable number of cores) or supercomputer. In some embodiments, processing circuitry may be distributed across multiple separate processors or processing units, for example, multiple of the same type of processing units (e.g., two Intel Core i7 processors) or multiple different processors (e.g., an Intel Core i5 processor and an Intel Core i7 processor). In some embodiments, control circuitry 404 executes instructions for a linguistics processing engine stored in memory (e.g., storage 410).

Memory may be an electronic storage device provided as storage 410, which is part of control circuitry 404. As referred to herein, the phrase “electronic storage device” or “storage device” should be understood to mean any device for storing electronic data, computer software, or firmware, such as random-access memory, read-only memory, hard drives, solid-state devices, quantum storage devices, or any other suitable fixed or removable storage devices, and/or any combination of the same. Nonvolatile memory may also be used (e.g., to launch a boot-up routine and other instructions).

The linguistics processing engine 402 may be coupled to a communications network. The communication network may be one or more networks including the Internet, a mobile phone network, a mobile voice or data network (e.g., a 5G, 4G, or LTE network), a mesh network, a peer-to-peer network, a cable network, or other types of communications network or combinations of communications networks. The linguistics processing engine may be coupled to a secondary communication network (e.g., Bluetooth, Near Field Communication, service provider proprietary networks, or wired connection) to the selected device for generation for playback. Paths may separately or together include one or more communications paths, such as a satellite path, a fiber-optic path, a cable path, a path that supports Internet communications, free-space connections (e.g., for broadcast or other wireless signals), or any other suitable wired or wireless communications path or combination of such paths.

FIG. 4B shows an illustrative block diagram 401 of the construction engine, in accordance with some embodiments of the disclosure. The construction engine may perform each of the operations individually or collaboratively. In some embodiments, the construction engine may be communicatively connected to a user interface. In some embodiments, the construction engine may include processing circuitry, control circuitry, and storage (e.g., RAM, ROM, hard disk, removable disk, etc.). The construction engine may include an input/output path 406. The construction engine may be coupled to a communications network.

FIG. 5 is an illustrative flowchart of a process 500 for generating supplemental content for media content, in accordance with some embodiments of the disclosure. Process 500, and any of the following processes, may be executed by control circuitry 404 (e.g., in a manner instructed to control circuitry 404 by the linguistics processing engine 402 and/or construction engine 412). Control circuitry 404 may be part of a network optimizer, or of a remote server separated from the network optimizer by way of a communication network, or distributed over a combination of both.

At 502, the linguistics processing engine 302, by control circuitry 404, generates for display a page of an electronic book. In some embodiments, the linguistics processing engine, subsequent to generating for display a page of an electronic book, transmits this information via the I/O path 406 to a device from devices 306-309.

At 504, the linguistics processing engine 302, by control circuitry 404, identifies a noun from the displayed page. In some embodiments, the identification of the noun from the displayed page of a device 306-309 is performed, at least in part, by processing circuitry 408.

At 506, the linguistics processing engine 302, by control circuitry 404, identifies a related word from the displayed page that is contextually related to the noun. In some embodiments, the identification of the related word from the displayed page of a device 306-309 that is contextually related to the noun is performed, at least in part, by processing circuitry 408.

At 508, the construction engine 304, by control circuitry 404, searches a plurality of content structures for a content structure that includes a matching object with an object name that matches the noun. In some embodiments, the searching of a plurality of content structures is performed by the construction engine 304 transmitting requests via the I/O path 406 to the content structure 310. In some embodiments, the matching performed by the construction engine 304 is performed, at least in part, by processing circuitry 408.

At 510, the construction engine 304, by control circuitry 404, determines whether an attribute table entry of the matching object has been identified that matches the related word. In one embodiment, if, at 510, control circuitry determines “No,” i.e., that no attribute table entry of the matching object matching the related word has been identified, the process advances to 514. At 514, the construction engine 304, by control circuitry 404, determines an approximate attribute of the matching object that matches the related word.

At 516, the construction engine 304, by control circuitry 404, generates a new content structure comprising the matching object. The matching object comprises the approximate attribute.

In another embodiment, if, at 510, control circuitry determines “No,” i.e., that no attribute table entry of the matching object matching the related word has been identified, the process advances to 518. At 518, the construction engine 304, by control circuitry 404, generates a new content structure comprising the matching object but excluding the non-matching attribute.

If, at 510, control circuitry determines “Yes,” i.e., that an attribute table entry of the matching object matching the related word has been identified, the process advances to 512. At 512, the construction engine 304, by control circuitry 404, generates a new content structure comprising the matching object. The matching object comprises the identified attribute.

At 520, the construction engine 304, by control circuitry 404, generates for output a content segment based on the new content structure. In some embodiments, the construction engine 304, by control circuitry 404, may transmit the content segment via the I/O path 406 to a device 306-309.
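
Pulling the steps of process 500 together, a hypothetical end-to-end driver might read as follows; it reuses score_nouns, related_words, and build_new_structure from the earlier sketches (assumed to be in scope) and omits the approximate-match and exclusion branches.

```python
# Hypothetical composition of process 500 (steps 502-520), assuming the
# helper functions from the earlier sketches are in scope; the
# approximate-match (514-516) and exclusion (518) branches are omitted.
def generate_supplemental_content(page_text, candidate_nouns,
                                  structures, user_keywords):
    noun = score_nouns(candidate_nouns, page_text, user_keywords)[0][0]  # 504
    related = related_words(page_text, noun)                             # 506
    new_structure = build_new_structure(structures, noun, related)       # 508-512
    return new_structure  # rendered and output as a content segment at 520
```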

FIG. 6 is an illustrative flowchart of a process 600 for outputting a content segment for an output duration based on an average reading speed value, in accordance with some embodiments of the disclosure. At 602, the linguistics processing engine 302, by control circuitry 404, determines a first reading location at a first time stamp on the page of the electronic book. In some embodiments, the linguistics processing engine 302 receives information from the devices 306-309 via the I/O path 406 (e.g., an optical sensor on the devices providing viewing angle information to the linguistics processing engine).

At 604, the linguistics processing engine 302, by control circuitry 404, determines a second reading location at a second time stamp on the page of the electronic book. The second time stamp occurs subsequent to the first time stamp. In some embodiments, the linguistics processing engine 302 receives information from the devices 306-309 via the I/O path 406 (e.g., an optical sensor on the devices providing viewing angle information to the linguistics processing engine).

At 606, the linguistics processing engine 302, by control circuitry 404, determines the amount of text between the first reading location and the second reading location. In some embodiments, the determination of the amount of text between the first reading location and the second reading location by the linguistics processing engine 302 is performed, at least in part, by processing circuitry 408.

At 608, the linguistics processing engine 302, by control circuitry 404, determines an average reading speed value based on the amount of text and a difference between the first time stamp and the second time stamp. In some embodiments, the determination of the average reading speed value by the linguistics processing engine 302 is performed, at least in part, by processing circuitry 408.

At 610, the construction engine 304, by control circuitry 404, outputs the content segment for an output duration based on the average reading speed value. In some embodiments, the construction engine 304, by control circuitry 404, may transmit the content segment via the I/O path 406 to a device 306-309.

It is contemplated that the steps or descriptions of FIGS. 5-6 may be used with any other embodiment of this disclosure. In addition, the steps and descriptions described in relation to FIGS. 5-6 may be done in alternative orders or in parallel to further the purposes of this disclosure. For example, each of these steps may be performed in any order or in parallel or substantially simultaneously to reduce lag or increase the speed of the system or method. Any of these steps may also be skipped or omitted from the process. Furthermore, it should be noted that any of the devices or equipment discussed in relation to FIGS. 3, 4A, and 4B could be used to perform one or more of the steps in FIGS. 5-6.

The processes discussed above are intended to be illustrative and not limiting. One skilled in the art would appreciate that the steps of the processes discussed herein may be omitted, modified, combined, and/or rearranged, and any additional steps may be performed without departing from the scope of the invention. More generally, the above disclosure is meant to be exemplary and not limiting. Only the claims that follow are meant to set bounds as to what the present invention includes. Furthermore, it should be noted that the features and limitations described in any one embodiment may be applied to any other embodiment herein, and flowcharts or examples relating to one embodiment may be combined with any other embodiment in a suitable manner, done in different orders, or done in parallel. In addition, the systems and methods described herein may be performed in real time. It should also be noted that the systems and/or methods described above may be applied to, or used in accordance with, other systems and/or methods.

1-30. (canceled)
31. A method comprising: generating for display a portion of an electronic book; identifying a setting of a scene referenced by text from the portion of the electronic book; identifying a time period for which the setting applies; accessing a database to retrieve audio data identified by the database as relevant to the setting; and generating an output based on the retrieved audio data for the duration of the identified time period.
32. The method of claim 31, wherein the identifying the setting of the scene comprises: identifying a noun in the portion of the electronic book that is currently displayed; retrieving, with a linguistics processing engine, a relevance score of the noun with respect to a potential setting; and in response to the relevance score exceeding a threshold, identifying the potential setting as relevant to the portion of the electronic book.
33. The method of claim 32, wherein the identifying the noun comprises: receiving metadata of the electronic book from a service providing the electronic book as content; analyzing the portion of the electronic book based on the metadata to generate a plurality of relevancy scores including a relevancy score for each noun in the portion of the electronic book; and identifying the noun with a highest relevancy score of the plurality of relevancy scores.
34. The method of claim 31, wherein the accessing the database to retrieve the audio data comprises: searching an attribute table for an attribute table entry related to the setting; and retrieving an audio data structure from the attribute table entry, and wherein the generating the output comprises generating, from a list of attributes in the audio data structure, an audio segment related to the attributes.
35. The method of claim 34, wherein the attribute table includes one or more object data structures.
36. The method of claim 35, wherein each of the one or more object data structures comprises one or more attribute table entries, and wherein each attribute table entry of the one or more attribute table entries includes a feature, and a temporal mapping indicating a time period to which the feature applies.
37. The method of claim 34, wherein the attribute table is mapped against a plurality of time stamps.
38. The method of claim 31, wherein the identifying the time period for which the setting applies comprises: at a first time, identifying a first reading location in the electronic book; at a second time, identifying a second reading location in the electronic book; calculating an average reading speed value based on: (a) an amount of text between the first and second reading locations, and (b) a time difference between the first time and the second time; and identifying the time period based on the determined average reading speed value and an amount of text in the electronic book relevant to the setting.
39. The method of claim 38, wherein the identifying the first reading location comprises: determining a first viewing angle based on a first image of a user captured by a camera of the electronic book; and identifying the first reading location in the electronic book from the first viewing angle, and wherein the identifying the second reading location comprises: determining a second viewing angle based on a second image of the user captured by the camera of the electronic book; and identifying the second reading location in the electronic book from the second viewing angle.
40. The method of claim 39, wherein the first viewing angle is calculated from a first location of eyes of the user at the first time and a location of the camera in relation to the electronic book, and wherein the second viewing angle is calculated from the location of the eyes of the user at the second time and the location of the camera in relation to the electronic book.
41. A system comprising: circuitry configured to: generate for display a portion of an electronic book; identify a setting of a scene referenced by text from the portion of the electronic book; identify a time period for which the setting applies; access a database to retrieve audio data identified by the database as relevant to the setting; and generate an output based on the retrieved audio data for the duration of the identified time period.
42. The system of claim 41, wherein the circuitry configured to identify the setting of the scene is configured to: identify a noun in the portion of the electronic book that is currently displayed; retrieve, with a linguistics processing engine, a relevance score of the noun with respect to a potential setting; and in response to the relevance score exceeding a threshold, identify the potential setting as relevant to the portion of the electronic book.
43. The system of claim 42, wherein the circuitry configured to identify the noun is configured to: receive metadata of the electronic book from a service providing the electronic book as content; analyze the portion of the electronic book based on the metadata to generate a plurality of relevancy scores including a relevancy score for each noun in the portion of the electronic book; and identify the noun with a highest relevancy score of the plurality of relevancy scores.
44. The system of claim 41, wherein the circuitry configured to access the database to retrieve the audio data is configured to: search an attribute table for an attribute table entry related to the setting; and retrieve an audio data structure from the attribute table entry, and wherein the circuitry configured to generate the output is configured to generate, from a list of attributes in the audio data structure, an audio segment related to the attributes.
45. The system of claim 44, wherein the attribute table includes one or more object data structures.
46. The system of claim 45, wherein each of the one or more object data structures comprises one or more attribute table entries, and wherein each attribute table entry of the one or more attribute table entries includes a feature, and a temporal mapping indicating a time period to which the feature applies.
47. The system of claim 44, wherein the attribute table is mapped against a plurality of time stamps.
48. The system of claim 41, wherein the circuitry configured to identify the time period for which the setting applies is configured to: at a first time, identify a first reading location in the electronic book; at a second time, identify a second reading location in the electronic book; calculate an average reading speed value based on: (a) an amount of text between the first and second reading locations, and (b) a time difference between the first time and the second time; and identify the time period based on the determined average reading speed value and an amount of text in the electronic book relevant to the setting.
49. The system of claim 48, wherein the circuitry configured to identify the first reading location is configured to: determine a first viewing angle based on a first image of a user captured by a camera of the electronic book; and identify the first reading location in the electronic book from the first viewing angle, and wherein the circuitry configured to identify the second reading location is configured to: determine a second viewing angle based on a second image of the user captured by the camera of the electronic book; and identify the second reading location in the electronic book from the second viewing angle.
50. The system of claim 49, wherein the first viewing angle is calculated from a first location of eyes of the user at the first time and a location of the camera in relation to the electronic book, and wherein the second viewing angle is calculated from the location of the eyes of the user at the second time and the location of the camera in relation to the electronic book.